Equilibrium Policy Gradients for Spatiotemporal Planning

Author

  • Mark Crowley
Abstract

In spatiotemporal planning, agents choose actions at multiple locations in space over some planning horizon to maximize their utility and satisfy various constraints. In forestry planning, for example, the problem is to choose actions for thousands of locations in the forest each year. The actions at each location could include harvesting trees, treating trees against disease and pests, or doing nothing. A utility model could place value on sale of forest products, ecosystem sustainability or employment levels, and could incorporate legal and logistical constraints such as avoiding large contiguous areas of clearcutting and managing road access. Planning requires a model of the dynamics. Existing simulators developed by forestry researchers can provide detailed models of the dynamics of a forest over time, but these simulators are often not designed for use in automated planning. This thesis presents spatiotemporal planning in terms of factored Markov decision processes. A policy gradient planning algorithm optimizes a stochastic spatial policy using existing simulators for dynamics. When a planning problem includes spatial interaction between locations, deciding on an action to carry out at one location requires considering the actions performed at other locations. This spatial interdependence is common in forestry and other environmental planning problems and makes policy representation and planning challenging. We define a spatial policy in terms of local policies defined as distributions over actions at one location conditioned upon actions at other locations. A policy gradient planning algorithm using this spatial policy is presented which uses Markov Chain Monte Carlo simulation to sample the

Similar resources

Land use and land cover spatiotemporal dynamic pattern and predicting changes using integrated CA-Markov model

Analyzing land use and land cover changes over long periods and predicting future changes is highly important and useful for land use managers. In this study, land use maps of the Ardabil plain in north-west Iran for four periods (1989, 1998, 2009 and 2013) are extracted and analyzed using remote sensing techniques applied to Landsat satellite images. T...

Bayesian modeling and analysis for gradients in spatiotemporal processes.

Stochastic process models are widely employed for analyzing spatiotemporal datasets in various scientific disciplines including, but not limited to, environmental monitoring, ecological systems, forestry, hydrology, meteorology, and public health. After inferring on a spatiotemporal process for a given dataset, inferential interest may turn to estimating rates of change, or gradients, over spac...

Model-Free Imitation Learning with Policy Optimization

In imitation learning, an agent learns how to behave in an environment with an unknown cost function by mimicking expert demonstrations. Existing imitation learning algorithms typically involve solving a sequence of planning or reinforcement learning problems. Such algorithms are therefore not directly applicable to large, high-dimensional environments, and their performance can significantly d...

Government and Central Bank Interaction under Uncertainty: A Differential Games Approach

Today, debt stabilization in an uncertain environment is an important issue. In particular, the question of how fiscal and monetary authorities should deal with this uncertainty is of much importance. Especially for some developing countries such as Iran, in which on average 60 percent of government revenues comes from oil, and consequently uncertainty about oil prices has a large effec...

Dynamic calcium movement inside cardiac sarcoplasmic reticulum during release.

RATIONALE: Intra-sarcoplasmic reticulum (SR) free [Ca] ([Ca](SR)) provides the driving force for SR Ca release and is a key regulator of SR Ca release channel gating during normal SR Ca release or arrhythmogenic spontaneous Ca release events. However, little is known about [Ca](SR) spatiotemporal dynamics. OBJECTIVE: To directly measure local [Ca](SR) with subsarcomeric spatiotemporal resolutio...

Publication date: 2011